
    A discriminative prototype selection approach for graph embedding in human action recognition

    This paper proposes a novel graph-based method for representing a human's shape during the performance of an action. Despite their strong representational power, graphs are computationally cumbersome for pattern analysis. One way of circumventing this problem is to transform the graphs into a vector space by means of graph embedding. Such an embedding can be conveniently obtained by way of a set of prototype graphs and a dissimilarity measure; yet the critical step in this approach is the selection of a suitable set of prototypes which can capture both the salient structure within each action class and the intra-class variation. This paper proposes a new discriminative approach to prototype selection which maximizes a function of the inter- and intra-class distances. Experiments on an action recognition dataset reported in the paper show that this discriminative approach outperforms well-established prototype selection methods such as center, border, and random prototype selection. © 2011 IEEE
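The embedding idea described above can be sketched in a few lines: a graph is mapped to the vector of its dissimilarities to a set of prototype graphs. The edge-overlap dissimilarity below is a toy stand-in for the paper's measure, and all names are illustrative.

```python
# Sketch of dissimilarity-based graph embedding. Graphs are represented as
# sets of undirected edges; the symmetric-difference dissimilarity is a toy
# placeholder for the paper's actual measure (e.g. graph edit distance).

def dissimilarity(g1, g2):
    """Toy graph dissimilarity: size of the symmetric difference of edge sets."""
    return len(g1 ^ g2)

def embed(graph, prototypes):
    """Map a graph to the vector of its dissimilarities to the prototypes."""
    return [dissimilarity(graph, p) for p in prototypes]

g = {frozenset({1, 2}), frozenset({2, 3})}
p1 = {frozenset({1, 2})}
p2 = {frozenset({3, 4})}
print(embed(g, [p1, p2]))  # [1, 3]
```

A discriminative selection procedure would then search for the prototype set whose induced embedding maximizes inter-class while minimizing intra-class distances.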

    Copula mixed-membership stochastic block model

    The Mixed-Membership Stochastic Blockmodel (MMSB) is a popular framework for modelling social relationships by fully exploiting each individual node's participation (or membership) in a social network. Despite its powerful representation, MMSB assumes that the membership indicators of each pair of nodes (i.e., people) are distributed independently. However, such an assumption often does not hold in real-life social networks, in which certain known groups of people may correlate with each other in terms of factors such as their membership categories. To expand MMSB's ability to model such dependent relationships, a new framework - a Copula Mixed-Membership Stochastic Blockmodel - is introduced in this paper for modelling intra-group correlations: an individual copula function jointly models the membership pairs of the nodes within the group of interest. This framework enables various copula functions to be used on demand, while maintaining the marginal distribution of the membership indicators, which is needed for modelling memberships with nodes outside the group of interest. Sampling algorithms for both finite and infinite numbers of groups are also detailed. Our experimental results show its superior performance in capturing group interactions when compared with the baseline models on both synthetic and real-world datasets.
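The key property of a copula, used above, is that it couples two variables while leaving each marginal distribution intact. The minimal sketch below draws a dependent pair with uniform marginals through a Gaussian copula; it illustrates the mechanism only, not the paper's model, and the correlation value is an arbitrary choice.

```python
import math
import random

def gaussian_copula_pair(rho, rng=random):
    """Draw (u1, u2) with uniform marginals whose dependence is governed by
    a Gaussian copula with correlation rho: sample correlated normals, then
    push each through the standard normal CDF."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return phi(z1), phi(z2)

random.seed(0)
pairs = [gaussian_copula_pair(0.9) for _ in range(5000)]
# The marginal of each coordinate stays roughly Uniform(0, 1) even though
# the pair is strongly positively dependent.
mean_u1 = sum(u for u, _ in pairs) / len(pairs)
print(round(mean_u1, 2))
```

Swapping in a different copula changes the joint dependence without disturbing the marginals, which is exactly the "on demand" flexibility the abstract refers to.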

    Infinite Author Topic Model based on Mixed Gamma-Negative Binomial Process

    Incorporating the side information of a text corpus, i.e., authors, time stamps, and emotional tags, into traditional text mining models has gained significant interest in the areas of information retrieval, statistical natural language processing, and machine learning. One branch of these works is the so-called Author Topic Model (ATM), which incorporates the authors' interests as side information into the classical topic model. However, the existing ATM needs to predefine the number of topics, which is difficult and inappropriate in many real-world settings. In this paper, we propose an Infinite Author Topic (IAT) model to resolve this issue. Instead of assigning a discrete probability to a fixed number of topics, we use a stochastic process to determine the number of topics from the data itself. To be specific, we extend a gamma-negative binomial process to three levels in order to capture the author-document-keyword hierarchical structure. Furthermore, each document is assigned a mixed gamma process that accounts for the multiple authors' contributions towards this document. An efficient Gibbs sampling inference algorithm with each conditional distribution in closed form is developed for the IAT model. Experiments on several real-world datasets show the capabilities of our IAT model to learn the hidden topics, the authors' interests in these topics, and the number of topics simultaneously. Comment: 10 pages, 5 figures, submitted to KDD conference
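The phrase "determine the number of topics from the data itself" is the hallmark of Bayesian nonparametric priors. The sketch below uses the Chinese Restaurant Process, a standard such prior, purely to illustrate how the number of clusters (topics) emerges from the data; the paper's actual construction is a gamma-negative binomial process, not a CRP.

```python
import random

def crp_assignments(n, alpha, rng=random):
    """Chinese Restaurant Process: customer i joins an existing table with
    probability proportional to its size, or opens a new table with
    probability proportional to alpha. Returns the table sizes, so the
    number of tables (clusters/topics) is inferred rather than fixed."""
    counts = []  # customers per table
    for i in range(n):
        r = rng.random() * (i + alpha)
        for k, c in enumerate(counts):
            if r < c:
                counts[k] += 1
                break
            r -= c
        else:
            counts.append(1)  # open a new table
    return counts

random.seed(1)
tables = crp_assignments(200, alpha=2.0)
print(len(tables), sum(tables))  # inferred number of clusters, total customers
```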

    A Bayesian nonparametric model for multi-label learning

    © 2017, The Author(s). Multi-label learning has become a significant learning paradigm in the past few years due to its broad application scenarios and the ever-increasing number of techniques developed by researchers in this area. Among existing state-of-the-art works, generative statistical models are characterized by their good generalization ability and robustness on a large number of labels through learning a low-dimensional label embedding. However, one issue with this branch of models is that the number of dimensions needs to be fixed in advance, which is difficult and inappropriate in many real-world settings. In this paper, we propose a Bayesian nonparametric model to resolve this issue. More specifically, we extend a gamma-negative binomial process to three levels in order to capture the label-instance-feature structure. Furthermore, a mixing strategy for gamma processes is designed to account for the multiple labels of an instance. The mixed process also complicates model inference, so an efficient Gibbs sampling inference algorithm is developed to resolve this difficulty. Experiments on several real-world datasets show the performance of the proposed model on multi-label learning tasks in comparison with three state-of-the-art models from the literature.
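The generative mechanism underlying gamma-negative binomial process models is the gamma-Poisson mixture: drawing a Poisson rate from a gamma distribution yields negative binomial counts. The sketch below shows only this two-level mechanism, not the paper's three-level construction; the shape/scale values are arbitrary.

```python
import math
import random

def poisson(lam, rng=random):
    """Knuth's multiplicative algorithm for sampling Poisson(lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def nb_via_gamma_poisson(shape, scale, rng=random):
    """Negative binomial count via the gamma-Poisson mixture: first draw a
    rate from Gamma(shape, scale), then a count from Poisson(rate)."""
    rate = rng.gammavariate(shape, scale)
    return poisson(rate, rng)

random.seed(2)
counts = [nb_via_gamma_poisson(2.0, 1.5) for _ in range(2000)]
mean = sum(counts) / len(counts)
print(round(mean, 1))  # theory: the mean is shape * scale = 3.0
```

The overdispersion of the resulting counts (variance exceeding the mean) is what makes this family a better fit for word and label counts than a plain Poisson.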

    Bayesian Nonparametric Relational Topic Model through Dependent Gamma Processes

    © 2016 IEEE. Traditional relational topic models provide a successful way to discover the hidden topics in a document network. Many theoretical and practical tasks, such as dimensionality reduction, document clustering, and link prediction, can benefit from this revealed knowledge. However, existing relational topic models are based on the assumption that the number of hidden topics is known a priori, which is impractical in many real-world applications. Therefore, in order to relax this assumption, we propose a nonparametric relational topic model that uses stochastic processes instead of fixed-dimensional probability distributions. Specifically, each document is assigned a gamma process, which represents that document's topic interests. Although this method provides an elegant solution, it brings additional challenges when mathematically modelling the inherent network structure of a typical document network, i.e., two spatially closer documents tend to have more similar topics. Furthermore, we require that the topics be shared by all the documents. In order to resolve these challenges, we use a subsampling strategy to assign each document a different gamma process derived from the global gamma process, and the subsampling probabilities of the documents are assigned with a Markov random field constraint that inherits the document network structure. Through the designed posterior inference algorithm, we can discover the hidden topics and their number simultaneously. Experimental results on both synthetic and real-world network datasets demonstrate the capabilities of learning the hidden topics and, more importantly, the number of topics.
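The subsampling strategy mentioned above can be pictured as Bernoulli thinning: every document keeps each atom of the global measure with some probability, so document-specific measures share their atoms with the global one. The sketch below shows only this thinning step; in the paper the keep-probabilities are tied together by a Markov random field over the document network, which is not modelled here.

```python
import random

def subsample_atoms(global_atoms, keep_prob, rng=random):
    """Bernoulli thinning of a global list of (weight, topic) atoms: each
    document independently keeps an atom with probability keep_prob,
    yielding a document-specific measure whose atoms are drawn from
    (and therefore shared with) the global measure."""
    return [a for a in global_atoms if rng.random() < keep_prob]

random.seed(3)
global_atoms = [(0.5, "topic0"), (0.3, "topic1"), (0.2, "topic2")]
doc_measure = subsample_atoms(global_atoms, keep_prob=0.7)
print(all(a in global_atoms for a in doc_measure))  # True: atoms are shared
```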

    A perceptual quality metric for 3D triangle meshes based on spatial pooling

    © 2018, Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature. In computer graphics, various processing operations are applied to 3D triangle meshes, and these operations often introduce distortions that affect the visual quality of the surface geometry. In this context, perceptual quality assessment of 3D triangle meshes has become a crucial issue. In this paper, we propose a new objective quality metric for assessing the visual difference between a reference mesh and a corresponding distorted mesh. Our analysis indicates that the overall quality of a distorted mesh is sensitive to the distortion distribution. The proposed metric is based on a spatial pooling strategy and statistical descriptors of the distortion distribution. We generate a perceptual distortion map for vertices in the reference mesh while taking into account the visual masking effect of the human visual system. The proposed metric extracts statistical descriptors from the distortion map as the feature vector to represent the overall mesh quality. With the feature vector as input, we adopt a support vector regression model to predict the mesh quality score. We validate the performance of our method on three publicly available databases, and the comparison with state-of-the-art metrics demonstrates the superiority of our method. Experimental results show that our proposed method achieves a high correlation between objective assessment and subjective scores.
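The spatial-pooling step described above reduces a per-vertex distortion map to a small feature vector of statistical descriptors. The sketch below uses mean, standard deviation, and the 90th percentile as an illustrative descriptor set; the paper's exact descriptors, masking model, and SVR stage are not reproduced here.

```python
import math

def pooled_features(distortion_map):
    """Feature vector from a per-vertex distortion map: mean, standard
    deviation, and 90th percentile capture both the level and the
    spatial concentration of the distortion."""
    n = len(distortion_map)
    mean = sum(distortion_map) / n
    std = math.sqrt(sum((d - mean) ** 2 for d in distortion_map) / n)
    p90 = sorted(distortion_map)[min(n - 1, int(0.9 * n))]
    return [mean, std, p90]

# Two meshes with the same mean distortion but different distributions:
concentrated = [0.0] * 9 + [1.0]   # damage focused on one vertex
uniform = [0.1] * 10               # same total damage spread evenly
print(pooled_features(concentrated))
print(pooled_features(uniform))
```

The two feature vectors share the same mean but differ in spread, which is exactly the sensitivity to the distortion distribution that a plain average would miss.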

    A new mesh visual quality metric using saliency weighting-based pooling strategy

    © 2018 Elsevier Inc. Several metrics have been proposed to assess the visual quality of 3D triangular meshes during the last decade. In this paper, we propose a mesh visual quality metric that integrates mesh saliency into mesh visual quality assessment. We use the Tensor-based Perceptual Distance Measure metric to estimate the local distortions of the mesh, and pool the local distortions into a quality score using a saliency weighting-based pooling strategy. Three well-known mesh saliency detection methods are used to demonstrate the superiority and effectiveness of our metric. Experimental results show that our metric with any of the three saliency maps performs better than state-of-the-art metrics on the LIRIS/EPFL general-purpose database. We also generate a synthetic saliency map by assembling salient regions from the individual saliency maps. Experimental results reveal that the synthetic saliency map achieves better performance than the individual saliency maps, and that the performance gain is closely correlated with the similarity between the individual saliency maps.
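Saliency weighting-based pooling itself is a weighted average: each vertex's local distortion contributes in proportion to its visual saliency. A minimal sketch, assuming the local-distortion and saliency values are already computed by upstream metrics:

```python
def saliency_weighted_pool(local_distortions, saliency):
    """Pool per-vertex local distortions into a single quality score,
    weighting each vertex by its saliency value."""
    total_w = sum(saliency)
    return sum(d * w for d, w in zip(local_distortions, saliency)) / total_w

distortions = [0.2, 0.8, 0.4]
flat = [1.0, 1.0, 1.0]         # uniform saliency reduces to the plain mean
salient_mid = [0.1, 0.8, 0.1]  # saliency concentrated on the damaged vertex
print(saliency_weighted_pool(distortions, flat))
print(saliency_weighted_pool(distortions, salient_mid))
```

With saliency concentrated on the heavily distorted vertex, the pooled score rises above the plain mean, matching the intuition that damage in salient regions is more visible.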

    On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization

    The prevailing thinking is that orthogonal weights are crucial to enforcing dynamical isometry and speeding up training. The increase in learning speed that results from orthogonal initialization in linear networks has been well-proven. However, while the same is believed to also hold for nonlinear networks when the dynamical isometry condition is satisfied, the training dynamics behind this contention have not been thoroughly explored. In this work, we study the dynamics of ultra-wide networks across a range of architectures, including Fully Connected Networks (FCNs) and Convolutional Neural Networks (CNNs) with orthogonal initialization, via the neural tangent kernel (NTK). Through a series of propositions and lemmas, we prove that two NTKs, one corresponding to Gaussian weights and one to orthogonal weights, are equal when the network width is infinite. Further, during training, the NTK of an orthogonally-initialized infinite-width network should theoretically remain constant. This suggests that orthogonal initialization cannot speed up training in the NTK (lazy training) regime, contrary to the prevailing thinking. In order to explore under what circumstances orthogonality can accelerate training, we conduct a thorough empirical investigation outside the NTK regime. We find that when the hyper-parameters are set to achieve a linear regime in nonlinear activation, orthogonal initialization can improve the learning speed with a large learning rate or large depth.
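Orthogonal initialization, the object of comparison above, replaces i.i.d. Gaussian weights with a matrix whose rows are orthonormal, so the layer preserves the norm of its input. A minimal sketch via Gram-Schmidt orthonormalization of a Gaussian matrix (in practice one would use a library QR routine):

```python
import random

def orthogonal_matrix(n, rng=random):
    """Build an n x n orthogonal matrix by applying Gram-Schmidt to the
    rows of a random Gaussian matrix. The result has orthonormal rows,
    the defining property of an orthogonal weight initialization."""
    rows = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    basis = []
    for v in rows:
        for b in basis:  # subtract projections onto earlier basis vectors
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = sum(x * x for x in v) ** 0.5
        basis.append([x / norm for x in v])
    return basis

random.seed(4)
W = orthogonal_matrix(4)
# Orthonormality: distinct rows have zero dot product, each row has norm 1.
dot01 = sum(a * b for a, b in zip(W[0], W[1]))
norm0 = sum(x * x for x in W[0]) ** 0.5
print(abs(dot01) < 1e-9, abs(norm0 - 1.0) < 1e-9)  # True True
```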

    Dependence testing and vectorization of multimedia agents

    We present a dependence testing algorithm that accounts for the short width of the SIMD registers in a typical modern microprocessor. The test works by solving the dependence system with the generalized GCD algorithm and then simplifying the solution equations for a particular set of dependence distances. We start by simplifying each solution lattice to generate points that satisfy some small constant dependence distance corresponding to the width of the register being used. We then use the Power Test to efficiently perform Fourier-Motzkin Variable Elimination on the simplified systems in order to determine whether dependences exist. The improvements described in this paper also extend our SIMD dependence test to loops with symbolic and triangular lower and upper bounds, as well as array indices that contain unknown symbolic additive constants. The resulting analysis is used to guide the vectorization pass of a dynamic multimedia compiler used to compile software agents that process audio, video, and image data. We fully detail the proposed dependence test in this paper, including the related work.
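The starting point for GCD-based dependence testing is the classic divisibility criterion: a linear dependence equation has an integer solution iff the GCD of its coefficients divides the constant term. The sketch below shows only this first filter, not the generalized GCD solver, lattice simplification, or Fourier-Motzkin elimination described above.

```python
import math

def gcd_dependence_exists(a, b, c):
    """Classic GCD dependence test for accesses A[a*i] and A[b*j + c]:
    the equation a*i - b*j = c has an integer solution iff
    gcd(a, b) divides c. Returning False proves independence; returning
    True only means a dependence *may* exist (bounds are not checked)."""
    return c % math.gcd(a, b) == 0

# A[2*i] written, A[2*j + 1] read: even vs. odd indices never collide.
print(gcd_dependence_exists(2, 2, 1))  # False -> safe to vectorize
# A[2*i] vs. A[4*j + 6]: gcd(2, 4) = 2 divides 6, so a dependence may exist.
print(gcd_dependence_exists(2, 4, 6))  # True
```

Tests like the one in the paper refine the "may exist" answer by checking whether the solutions fall within the loop bounds and within the register-width window of dependence distances.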

    Image denoising based on nonlocal Bayesian singular value thresholding and Stein's unbiased risk estimator

    © 1992-2012 IEEE. Singular value thresholding (SVT)- or nuclear norm minimization (NNM)-based nonlocal image denoising methods often rely on a precise estimate of the noise variance. However, most existing methods either assume that the noise variance is known or require an extra step to estimate it. Under the iterative regularization framework, the error in the noise variance estimate propagates and accumulates with each iteration, ultimately degrading the overall denoising performance. In addition, these methods are in essence still least squares estimators, which can incur a very high mean-squared error (MSE) and are inadequate for handling missing data or outliers. To address these deficiencies, we present a hybrid denoising model based on variational Bayesian inference and Stein's unbiased risk estimator (SURE), which consists of two complementary steps. In the first step, the variational Bayesian SVT performs a low-rank approximation of the nonlocal image patch matrix to simultaneously remove the noise and estimate the noise variance. In the second step, we modify the conventional SURE full-rank SVT and its divergence formulas for rank-reduced eigen-triplets to remove the residual artifacts. The proposed hybrid BSSVT method achieves better performance in recovering the true image compared with state-of-the-art methods.
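At the core of SVT/NNM denoisers is a simple shrinkage operator on the singular values of the patch matrix: subtract a threshold and clip at zero, so large (signal) components survive while small (noise) components are discarded. A minimal sketch of that step alone, assuming the SVD has already been computed:

```python
def soft_threshold(singular_values, tau):
    """Singular value soft-thresholding: shrink each singular value by tau
    and zero out those that fall below it. This is the proximal operator
    of the nuclear norm, i.e., the low-rank shrinkage step in SVT."""
    return [max(s - tau, 0.0) for s in singular_values]

# Large singular values (signal) survive; small ones (noise) are removed.
print(soft_threshold([10.0, 3.0, 0.5, 0.2], tau=1.0))  # [9.0, 2.0, 0.0, 0.0]
```

The denoised patch matrix is then rebuilt from the surviving singular triplets; choosing tau well is precisely where the noise-variance estimate (or, in the paper, the variational Bayesian and SURE machinery) comes in.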